Clustering methods with dimension reduction have received considerable interest in statistics recently, and many methods that simultaneously perform clustering and dimension reduction have been proposed. This work presents a novel procedure for simultaneously determining the optimal cluster structure for multivariate binary data and the subspace that represents that cluster structure. The method is based on a finite mixture model of multivariate Bernoulli distributions, in which each component is assumed to have a low-dimensional representation of the cluster structure. The method can be considered an extension of the traditional latent class analysis model. Sparsity is imposed on the loadings that define the low-dimensional subspace, which improves interpretability and stabilizes the extraction of the subspace. An EM-based algorithm is developed to solve the proposed optimization problem efficiently. We demonstrate the effectiveness of the proposed method through a simulation study and applications to real datasets.
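
For orientation, the traditional latent class model that the proposed method extends can be fitted by EM roughly as sketched below. This is a minimal illustration of that baseline only, not the proposed procedure: it omits the low-dimensional representation and the sparse loadings, and the function name `fit_bernoulli_mixture` and its parameters are hypothetical names chosen for this example.

```python
import numpy as np


def fit_bernoulli_mixture(X, n_components, n_iter=100, seed=0, eps=1e-9):
    """EM for a plain mixture of independent Bernoullis (baseline latent class model).

    X is an (n, p) array of 0/1 values. Returns the mixing proportions,
    the per-component item probabilities, and the posterior responsibilities.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Initial values: equal mixing proportions, random item probabilities.
    pi = np.full(n_components, 1.0 / n_components)
    theta = rng.uniform(0.25, 0.75, size=(n_components, p))

    for _ in range(n_iter):
        # E-step: log joint density of each row under each component.
        theta_c = np.clip(theta, eps, 1.0 - eps)
        log_joint = (
            X @ np.log(theta_c).T
            + (1.0 - X) @ np.log(1.0 - theta_c).T
            + np.log(pi + eps)
        )
        # Normalise in log space to get posterior class probabilities.
        log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
        resp = np.exp(log_joint - log_norm)

        # M-step: weighted proportions and weighted item means.
        nk = resp.sum(axis=0)
        pi = nk / n
        theta = (resp.T @ X) / (nk[:, None] + eps)

    return pi, theta, resp
```

Each row of `resp` gives the posterior class membership probabilities for one observation; taking the row-wise argmax yields a hard cluster assignment.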